Online L1-Dictionary Learning with Application to Novel Document Detection

Authors

  • Shiva Prasad Kasiviswanathan
  • Huahua Wang
  • Arindam Banerjee
  • Prem Melville
Abstract

Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online $\ell_1$-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the $\ell_1$-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on the alternating direction method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online $\ell_1$-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.
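The contrast between the two loss functions can be sketched as follows. The abstract does not spell out the objective, so the symbols here are assumptions for illustration: $P$ is a matrix whose columns are documents, $A$ is the dictionary, $X$ holds the sparse codes, $\lambda > 0$ is a sparsity weight, and the matrix $\ell_1$ norm is taken entrywise. The sparsity term on $X$ is the usual sparse-coding regularizer and is likewise assumed.

```latex
% Traditional dictionary learning: squared (Frobenius) reconstruction loss
\min_{A,\,X} \; \|P - AX\|_F^2 + \lambda \|X\|_1

% \ell_1-dictionary learning: entrywise absolute reconstruction loss
\min_{A,\,X} \; \|P - AX\|_1 + \lambda \|X\|_1,
\qquad \|M\|_1 = \sum_{i,j} |M_{ij}|
```

Under this reading, a single large reconstruction error contributes linearly to the $\ell_1$ loss rather than quadratically, which is the only difference between the two objectives above.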


Similar resources

Fast online L1-dictionary learning algorithms for novel document detection

Online L1-dictionary learning, introduced by Kasiviswanathan et al. [1], is the process of generating a sequence of (dictionary) matrices $\{A_{t+1}\}$, one at a time, for $t = 0, 1, \ldots$. After committing to $A_{t+1}$, a pair of matrices $(P_{t+1}, X_{t+1})$ is revealed and the online algorithm incurs a cost of $\|P_{t+1} - A_{t+1} X_{t+1}\|_1$. The goal of the online algorithm is to ensure that the total cost up to each time is...
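The cost accounting in this protocol can be illustrated with a minimal sketch. It only tallies the per-round and running costs for dictionaries that have already been committed; it does not implement any dictionary update rule. The function names (`matmul`, `l1_cost`, `cumulative_costs`) are hypothetical, and the matrix L1 norm is assumed to be the sum of absolute entries.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def l1_cost(p: Matrix, a: Matrix, x: Matrix) -> float:
    """Entrywise L1 reconstruction cost: sum of |P - A X| over all entries."""
    ax = matmul(a, x)
    return sum(
        abs(p[i][j] - ax[i][j])
        for i in range(len(p))
        for j in range(len(p[0]))
    )


def cumulative_costs(
    dictionaries: Sequence[Matrix],
    revealed: Sequence[Tuple[Matrix, Matrix]],
) -> List[float]:
    """Running total cost, where dictionaries[t] was committed
    before the pair revealed[t] = (P, X) became visible."""
    total = 0.0
    running = []
    for a, (p, x) in zip(dictionaries, revealed):
        total += l1_cost(p, a, x)
        running.append(total)
    return running


# Two rounds with small hand-made matrices (illustrative values only).
dicts = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 1.0]]]
pairs = [
    ([[1.0, 2.0], [3.0, 5.0]], [[1.0, 2.0], [3.0, 4.0]]),
    ([[1.0], [1.0]], [[1.0], [1.0]]),
]
costs = cumulative_costs(dicts, pairs)
```

Because each dictionary is fixed before its pair is revealed, the running total is what an online algorithm would try to keep small over time.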


Novel Document Detection using Online L1-Dictionary Learning

Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a robust and scalable manner. In this paper, we introduce an approach for novel document detection based on online dictionary learning. Unlike traditional...


A Novel Face Detection Method Based on Over-complete Incoherent Dictionary Learning

In this paper, face detection problem is considered using the concepts of compressive sensing technique. This technique includes dictionary learning procedure and sparse coding method to represent the structural content of input images. In the proposed method, dictionaries are learned in such a way that the trained models have the least degree of coherence to each other. The novelty of the prop...



Novel document detection for massive data streams using distributed dictionary learning

Given the high volume of content being generated online, it becomes necessary to employ automated techniques to separate out the documents belonging to novel topics from the background discussion, in a robust and scalable manner (with respect to the size of the document set). We present a solution to this challenge based on sparse coding, in which a stream of documents (where each document is m...



Publication date: 2012